Abstract

A finite probabilistic system (FPS) is a discrete-time controlled stochastic process having finite input, output, and (internal) state sets. (A partially-observed Markov decision process is an example of an FPS.) It may be viewed as the simplest formulation of a nonlinear estimation and control problem.

Under conditions similar to observability and controllability in linear systems, the problem of selecting inputs, on the basis of past inputs and outputs (with perfect recall), so as to maximize a time-averaged expected reward, is shown to be meaningful as the horizon increases without bound or as a discount approaches unity: an optimal strategy exists; it may be realized by a (strategy-independent) state estimator along with a stationary policy on the state estimate; and its performance does not depend on the initial state of information.

Dual control aspects of the problem, and potential extension of the results to more general systems, are briefly discussed.


I. INTRODUCTION


The deceptive simplicity of the linear-quadratic-Gaussian problem formulation and solution has been articulated by Witsenhausen [18], among others. This paper describes recent work (much of which was originally reported in the author's doctoral dissertation [11]) aimed at understanding the relationship between estimation and control in a more general setting. Specifically, it examines a class of discrete-time undiscounted infinite-horizon stochastic control problems in which the input, output, and state sets are all finite. Conditions similar to controllability and observability are introduced and shown to imply well-posedness of the problem in the following sense: The optimal performance converges to that of a stationary policy on the sufficient statistic, as the horizon grows without bound or the discount approaches unity.


This approach clarifies the concept of "dual control" [7, ?, 17] in undiscounted infinite-horizon stochastic control. Any "dual control" problem can be slightly modified so that the conditions described above are satisfied. On the other hand, some unmodified "dual control" problems are meaningless unless a finite horizon or discount rate is specified.

Consider, for example, a fair coin that is tossed at times $k = 0, 1, \ldots$. The outcome of toss $k$ is denoted $s(k) = H$ or $T$. Immediately after toss $k > 0$, an experimenter observes $y(k)$ where

$$y(k) = \begin{cases} 0, & \text{if } s(k-1) = s(k) \\ 1, & \text{if } s(k-1) \neq s(k) \end{cases}$$

The experimenter then selects an input from the set $\{H, T, B\}$. The object is to maximize the limiting frequency of correct guesses $u(k) = s(k)$. State information is gained by selecting $u(k) = B$, which causes a biased coin (e.g. $\Pr\{s(k+1) \text{ is } H\} = .6$) to be used in toss $k+1$.

If the horizon is finite or a discount $\beta$ is used, then the problem is well-posed; the biased coin is used during a finite interval, and the most likely state is selected thereafter. As the horizon grows without bound or $\beta \uparrow 1$, the limiting strategy becomes: $u(k) = B$, indefinitely. Since there will be no guesses, and hence no correct guesses, this is the worst possible strategy.

An optimal strategy is:

$$u(k) = \begin{cases} B, & \text{if } k \text{ is a power of 2} \\ \text{the most likely state}, & \text{otherwise} \end{cases}$$

The limiting proportion of correct guesses is now 1. This strategy suffers the aesthetic drawback of being nonstationary. And it clearly is not approached as the horizon grows without bound or the discount approaches unity. For these reasons, the problem is considered to be ill-posed in the conventional undiscounted infinite horizon formulation.
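The belief bookkeeping behind the coin example can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not part of the original development: the names `update_belief`, `is_power_of_two` and `simulate` are hypothetical, the bias 0.6 is the value quoted in the example, and a time step at which the input is B is counted as a step without a correct guess. The posterior follows from Bayes' rule applied to the two pairs (s(k), s(k+1)) that are consistent with the observed y(k+1).

```python
import random


def update_belief(p_heads, used_biased, y, bias=0.6):
    """Posterior Pr{s(k+1) = H} given the prior Pr{s(k) = H} = p_heads,
    the coin used for toss k+1, and the observation y(k+1)."""
    q = bias if used_biased else 0.5          # Pr{s(k+1) = H} for the coin tossed
    if y == 0:                                # no change: s(k+1) = s(k)
        num = p_heads * q
        den = p_heads * q + (1.0 - p_heads) * (1.0 - q)
    else:                                     # change: s(k+1) != s(k)
        num = (1.0 - p_heads) * q
        den = (1.0 - p_heads) * q + p_heads * (1.0 - q)
    return num / den


def is_power_of_two(k):
    return k > 0 and (k & (k - 1)) == 0


def simulate(horizon, measure=is_power_of_two, seed=0, bias=0.6):
    """Fraction of time steps with a correct guess when B is selected
    whenever measure(k) is true and the most likely state otherwise."""
    rng = random.Random(seed)
    state_is_heads = rng.random() < 0.5       # fair toss at k = 0
    p = 0.5                                   # belief Pr{s(k) = H}
    correct = 0
    for k in range(horizon):
        use_biased = measure(k)
        if not use_biased:
            guess_heads = p >= 0.5            # guess the most likely state
            correct += int(guess_heads == state_is_heads)
        q = bias if use_biased else 0.5
        next_is_heads = rng.random() < q      # toss k+1
        y = int(next_is_heads != state_is_heads)
        p = update_belief(p, use_biased, y, bias)
        state_is_heads = next_is_heads
    return correct / horizon
```

Passing `measure=lambda k: True` reproduces the degenerate limiting strategy u(k) = B, which never scores a correct guess. With a fair coin the update leaves the belief unchanged when y = 0 and reflects it about 1/2 when y = 1, so only the biased tosses add information.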


The problem becomes more tractable if we add to the plant model a mechanism whereby observation dynamics fail (in a specifically described manner, e.g. equally likely observation of 0 or 1) with probability $\varepsilon$, with $0 < \varepsilon \ll 1$. This version of the problem has a solution that agrees with LQG-induced intuition. The optimal strategy is stationary, and alternates between measurement and guessing with an average period that grows without bound as $\varepsilon \to 0$. Because the system is fallible, the mathematics of optimization will not reach into the arbitrarily distant past for information that in practice would surely have become noise-corrupted.

This paper will describe conditions that imply desirable structural properties of the type discussed above. Results are stated without proof; for details, see [11,12,13]. Our presentation follows a standard plan:

• Problem Formulation: Give the plant model and performance criterion.

• State Estimation: Derive a recursive form for the sufficient statistic and specify a condition for stability of the state estimation process.

• Dynamic Programming Formulation: Define an operator whose fixed point is the solution to the infinite horizon problem.

• Fixed Point Theorem: Prove that the dynamic programming operator has a unique fixed point.

• Computational Considerations: Show how an $\varepsilon$-optimal solution can be obtained on a digital computer.


II. PROBLEM FORMULATION


a) The plant model

(2.1) Definition. A finite probabilistic (dynamical) system (FPS) is a 4-tuple $(U, Y, S, \{P(y|u) : y \in Y, u \in U\})$ where:

(i) $U$ is a finite nonempty set of input values (or decisions);

(ii) $Y$ is a finite nonempty set of output values (or observations);

(iii) $S = \{1, \ldots, N\}$ is a finite nonempty set of (internal) state values;

(iv) Each $P(y|u)$ is an $N \times N$ substochastic matrix of state transition probabilities, and

(2.2) $P(u) = \sum_{y \in Y} P(y|u)$

is a stochastic matrix.

The dynamic evolution of an FPS is described in the following terminology:

1. When a decision-maker specifies input $u(k)$, that input is said to be accepted by the FPS. Output $y(k+1)$ is subsequently emitted by the FPS.

2. Given that an FPS in state $s(k) = i$ accepts input $u(k) = u$, it will undergo a transition to state $s(k+1) = j$ and emit output $y(k+1) = y$, with (conditional) probability $P_{ij}(y|u)$, (conditionally) independently of the "past" $\{s(k'), u(k'), y(k'+1)\}_{k'=0}^{k-1}$.

3. The Markov decision process (MDP) consisting of the internal state and input processes of an FPS is called the underlying process (of that FPS). It is described by the stochastic matrices $\{P(u) : u \in U\}$.

4. The time set is $\{0, \ldots, K\}$. The terminal time $K$ is called the horizon.

Remark. This notation is due to Paz [10].

b) The probability spaces

An FPS is studied in conjunction with an initial state probability (ISP) and a control strategy (CS).

The ISP, denoted by $\pi$, is a stochastic N-vector having the interpretation $\pi_i = \Pr\{s(0) = i\}$. The set of ISP's (i.e. the set of horizontal stochastic N-vectors) is denoted by $\Pi$.

The CS, denoted by $\gamma$, is a mapping $\gamma : Z^* \to U$, where $Z^*$ represents the free monoid generated by $U \times Y$, i.e. the set of finite strings of I/O pairs. A decision-maker acting according to $\gamma$ selects inputs

(2.3) $u(k) = \gamma[z(k)]$

where $z(k)$ is the information vector

(2.4) $z(k) = (u(0), y(1)) (u(1), y(2)) \cdots (u(k-1), y(k))$.

The set of CS's (i.e. the set of mappings from $Z^*$ to $U$) is denoted by $\Gamma$.

We may view $\{s(k), u(k), y(k)\}$ as random variables on a probability space $P[\pi, \gamma] = (\Omega, F, \Pr_{\pi,\gamma})$ where: $\Omega$ is the infinite product set of $S \times U \times Y$; $F$ is the $\sigma$-algebra generated by the finite cylinders; and $\Pr_{\pi,\gamma}$ is determined in a straightforward manner from the transition probabilities described above.

$E_{\pi,\gamma}$ will denote the expectation operator associated with $\Pr_{\pi,\gamma}$.

c) The performance indices

Consider a bounded real-valued function $R$ on $S \times U \times Y \times S$, and define

(2.5) $r(k) = R[s(k), u(k), y(k+1), s(k+1)]$

(2.6) $g(K) = K^{-1} \sum_{k=0}^{K-1} r(k)$

(2.7) $g(\beta) = (1-\beta) \sum_{k=0}^{\infty} \beta^k r(k)$, $\beta < 1$.

We call $r(k)$ an incremental reward; $g(K)$ is the time-averaged reward, and $g(\beta)$ is the discount-averaged reward. Each is a random variable on $P[\pi, \gamma]$.

d) Statement of the problem

The problem is to demonstrate the existence of strategies that "optimize" the infinite-horizon performance indices $\lim_{K \to \infty} g(K)$ and $\lim_{\beta \uparrow 1} g(\beta)$. Specifically, we determine conditions that assure the existence of an optimal performance $g$, and a family $\{\gamma_\pi\}$ of optimal CS's such that, for all ISP's $\pi$ and all CS's $\gamma$,

(2.8) $\lim_{K \to \infty} E_{\pi,\gamma_\pi}\{g(K)\} = \lim_{\beta \uparrow 1} E_{\pi,\gamma_\pi}\{g(\beta)\} = g$

(2.9) $\limsup_{K \to \infty} E_{\pi,\gamma}\{g(K)\} \le g$

(2.10) $\limsup_{\beta \uparrow 1} E_{\pi,\gamma}\{g(\beta)\} \le g$.

e) Bibliographic notes

Standard references on the role of MDP's in stochastic control theory are Bertsekas [4] and Kushner [9]. The Partially Observed MDP was independently conceived by Drake [6] and Astrom [1,2]. Computational algorithms that solve finite-horizon and discounted POMDP's have been given by Smallwood and Sondik [15] and Sondik [16]. A more extensive bibliography may be found in [11,12,13].


III. THE STATE ESTIMATOR FOR FPS's


a) The recursive formula

Let us introduce some terminology:

(3.1) For $z = (u_1, y_1)(u_2, y_2) \cdots (u_k, y_k) \in Z^*$, define the matrix product $P(z) = P(y_1|u_1) \cdot P(y_2|u_2) \cdot \ldots \cdot P(y_k|u_k)$.

(3.2) Define the vertical N-vector $\mathbf{1} = (1, \ldots, 1)^t$.

(3.3) Define $T(\pi, z) = \pi P(z) / \pi P(z) \mathbf{1}$, when $\pi P(z) \neq 0$.

(3.4) Define random variables on $P[\pi, \gamma]$: $\Pi^{\pi}(k) = T(\pi, z(k))$.

Now $\Pi^{\pi}(k)$ is the vector of conditional state probabilities at time $k$, given inputs and outputs that have evolved up to that time. It may be computed by the (strategy-independent) recursive formula

(3.5) $\Pi^{\pi}(k) = \begin{cases} \pi, & \text{if } k = 0 \\ T(\Pi^{\pi}(k-1), (u(k-1), y(k))), & \text{otherwise} \end{cases}$

b) A metric on $\Pi$

(3.6) Definition (Bayes' operator). For $\pi \in \Pi$, $w \in R^N$, with $w_i \ge 0\ \forall i \in S$ and $\pi w > 0$, let $\pi \circ w$ denote the vector in $\Pi$ having entries $(\pi \circ w)_i = \pi_i w_i / \pi w$.

(3.7) Definition. For $\pi, \pi' \in \Pi$, define

(a) $|\pi - \pi'| = \sum_{i \in S} |\pi_i - \pi'_i|$;

(b) $\delta[\pi, \pi'] = \sum_{i \in S} \max(\pi_i - \pi'_i, 0)$;

(c) $\Delta[\pi, \pi'] = \sup\{\delta[\pi \circ w, \pi' \circ w] : w \in R^N,\ w_i \ge 0,\ i \in S,\ \pi w > 0,\ \pi' w > 0\}$.

(3.8) Lemma. $|\cdot - \cdot|$, $\delta$ and $\Delta$ are metrics on $\Pi$, and

$0 \le \tfrac{1}{2} |\pi - \pi'| = \delta[\pi, \pi'] \le \Delta[\pi, \pi'] \le 1$.

(3.9) Theorem (evaluation of $\Delta$). For $\pi, \pi' \in \Pi$, define:

$c_1 = \min\{\pi_i / \pi'_i : \pi'_i > 0\}$;

$c_2 = \min\{\pi'_i / \pi_i : \pi_i > 0\}$.

Then $\Delta[\pi, \pi']$ can be evaluated directly from $c_1$ and $c_2$.
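The three distances of (3.7) can be sketched numerically as follows. The function names are hypothetical. The closed form used for $\Delta$, namely $(1 - \sqrt{c_1 c_2}) / (1 + \sqrt{c_1 c_2})$, is an assumption of this sketch and is not quoted from the theorem: it is what one obtains by concentrating the weight vector $w$ on the two indices with extreme probability ratios. It is consistent with the inequalities of Lemma (3.8), and the sampled supremum is included only as a sanity check of that assumption.

```python
import math
import random


def l1_distance(p, q):
    """|p - q| of (3.7)(a)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def hajnal(p, q):
    """delta[p, q] of (3.7)(b)."""
    return sum(max(a - b, 0.0) for a, b in zip(p, q))


def bayes(p, w):
    """Bayes' operator of (3.6): entries p_i w_i / (p w)."""
    total = sum(a * b for a, b in zip(p, w))
    if total <= 0.0:
        raise ValueError("p w must be positive")
    return [a * b / total for a, b in zip(p, w)]


def ratio_constants(p, q):
    """The constants c_1 and c_2 of (3.9)."""
    c1 = min(a / b for a, b in zip(p, q) if b > 0)
    c2 = min(b / a for a, b in zip(p, q) if a > 0)
    return c1, c2


def delta_closed_form(p, q):
    """ASSUMED closed form for Delta[p, q]; see the lead-in."""
    c1, c2 = ratio_constants(p, q)
    root = math.sqrt(c1 * c2)
    return (1.0 - root) / (1.0 + root)


def delta_sampled(p, q, trials=2000, seed=0):
    """Lower estimate of the supremum in (3.7)(c) from random positive w."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        w = [rng.random() + 1e-9 for _ in p]
        best = max(best, hajnal(bayes(p, w), bayes(q, w)))
    return best
```

For example, with `p = [0.5, 0.5]` and `q = [0.8, 0.2]` one finds $c_1 = 0.625$, $c_2 = 0.4$, so the assumed closed form gives 1/3, and the weight vector `w = [1.0, 2.0]` attains that value exactly. Distributions with different supports give $c_1 c_2 = 0$ and hence the value 1.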


The metric $\delta$, also known as the Hajnal measure, has many applications in the theory of ergodic Markov chains [8]. Informally, $\delta[\pi, \pi']$ is the (minimal) "quantity of probability" that would have to be "reassigned" in order to transform probability distribution $\pi$ into probability distribution $\pi'$. Similarly, $\Delta[\pi, \pi']$ is the least upper bound on the quantity of conditional probability by which $\pi$ and $\pi'$ might differ if they were conditioned on identical observations.

The distinction between $\delta$ and $\Delta$ is also illuminated by an examination of the topologies they induce on $\Pi$: the topology induced by $\delta$ is connected, but $\Delta$ causes $\Pi$ to be separated into $2^N - 1$ "faces".

c) The contraction property of T

It is well known that if $P$ is a stochastic matrix and

(3.10) $\alpha[P] = \max_{i,j \in S} \delta[e^i P, e^j P] < 1$

(where $e^i$ denotes the unit vector in $\Pi$ whose $i$-th entry equals unity) then, for any $\pi, \pi' \in \Pi$,

(3.11) $\delta[\pi P, \pi' P] \le \alpha[P]\, \delta[\pi, \pi']$,

i.e., the transformation $f[\pi] = \pi P$ is a contraction in $\Pi$. One consequence of this property of $P$ is that $\{\pi (P)^n\}$ approaches a unique limit as $n \to \infty$. The rate of convergence $\alpha[P]$ is called the ergodic coefficient of the stochastic matrix $P$.
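The ergodic coefficient (3.10) and the contraction (3.11) are easy to check numerically. The following is a minimal sketch with hypothetical function names and an arbitrary 2-state example matrix; it is not taken from the source.

```python
def hajnal(p, q):
    """delta[p, q] = sum_i max(p_i - q_i, 0)."""
    return sum(max(a - b, 0.0) for a, b in zip(p, q))


def vec_mat(pi, P):
    """Row vector times matrix: (pi P)_j = sum_i pi_i P_ij."""
    n_cols = len(P[0])
    return [sum(pi[i] * P[i][j] for i in range(len(P))) for j in range(n_cols)]


def ergodic_coefficient(P):
    """alpha[P] of (3.10): the largest delta-distance between two rows of P."""
    return max(hajnal(row_i, row_j) for row_i in P for row_j in P)


def iterate(pi, P, steps):
    """Compute pi P^steps by repeated multiplication."""
    for _ in range(steps):
        pi = vec_mat(pi, P)
    return pi


# Arbitrary illustrative chain; its stationary distribution is (2/3, 1/3).
EXAMPLE_P = [[0.9, 0.1],
             [0.2, 0.8]]
```

For `EXAMPLE_P` the two rows differ by 0.7 in the $\delta$ metric, so any two initial distributions approach each other at least geometrically with ratio 0.7 per step, and both converge to the same limit.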

(3.12) Definition: If $P$ is a nonzero substochastic matrix, then define

$\alpha[P] = \max\{\Delta[T(e^i, P), T(e^j, P)] : e^i P \neq 0,\ e^j P \neq 0\}$.

Remark: The evaluation of $\alpha[P]$ by (3.9) requires $N^3$ operations. This is comparable to the effort expended when multiplying two $N \times N$ matrices.

The generalized ergodic coefficient $\alpha[P]$ has the following properties:

(3.13) Lemma.

(a) $0 \le \alpha[P] \le 1$ for all substochastic matrices $P \neq 0$.

(b) $\alpha[P] < 1 \iff P$ is subrectangular*.

(c) $\alpha[P] = 0 \iff \operatorname{rank}(P) = 1$.

* In a subrectangular matrix, $P_{ij} > 0$ and $P_{mn} > 0$ imply $P_{in} > 0$ and $P_{mj} > 0$.

(3.14) Theorem. (Contraction Property of T)

$\Delta[T(\pi, P), T(\pi', P)] \le \alpha[P]\, \Delta[\pi, \pi']$, $\quad \pi P \neq 0,\ \pi' P \neq 0$.

(3.15) Corollary.

$\alpha[P] = \sup\{\Delta[T(\pi, P), T(\pi', P)] : \pi P \neq 0,\ \pi' P \neq 0\}$.

(3.16) Corollary. $\alpha[PQ] \le \alpha[P]\, \alpha[Q]$.
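The subrectangularity condition in the footnote to Lemma (3.13) can be checked mechanically. The sketch below uses hypothetical function names. The first function tests the definition literally; the second uses the equivalent observation (stated here as a convenience, not taken from the source) that all nonzero rows must have the same set of positive columns.

```python
def is_subrectangular(P):
    """Literal test: P_ij > 0 and P_mn > 0 imply P_in > 0 and P_mj > 0."""
    n_rows, n_cols = len(P), len(P[0])
    for i in range(n_rows):
        for j in range(n_cols):
            if P[i][j] <= 0:
                continue
            for m in range(n_rows):
                for n in range(n_cols):
                    if P[m][n] > 0 and not (P[i][n] > 0 and P[m][j] > 0):
                        return False
    return True


def nonzero_rows_share_support(P):
    """Equivalent test: every nonzero row is positive on the same columns."""
    supports = {tuple(x > 0 for x in row) for row in P if any(x > 0 for x in row)}
    return len(supports) <= 1
```

A diagonal matrix with two positive entries fails both tests, which matches the intuition that an output revealing the state exactly cannot make two different prior beliefs agree.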


d) Another metric on $\Pi$

With $c_1, c_2$ as in (3.9), define

(3.17) $D[\pi, \pi'] = 1 - \min\{c_1, c_2\}$.

Now $D$ is a metric on $\Pi$ and $(1/4)\, D[\pi, \pi'] \le \Delta[\pi, \pi'] \le D[\pi, \pi'] \le 1$. It has the following remarkable property (required in Theorem (5.2)): If $v$ is a convex function on $\Pi$ and $\|v\| = \sup_{\pi, \pi' \in \Pi} (v(\pi) - v(\pi'))$ then $|v(\pi) - v(\pi')| \le \|v\| \cdot D[\pi, \pi']$.

This occurs because the discontinuities* of $\Delta$ (discussed in Section III b) coincide with the potential discontinuities of a convex function on $\Pi$.

e) The condition on observation dynamics

As in (3.12) define

(3.18) $\hat{\alpha}[P] = \max\{D[T(e^i, P), T(e^j, P)] : e^i P \neq 0,\ e^j P \neq 0\}$.

Now consider the following condition.

(3.19) Condition (detectability). There is an $\alpha < 1$ and an integer $\ell$ such that, for every ISP $\pi$ and every CS $\gamma$:

$E_{\pi,\gamma}\{\alpha[P(z(\ell))]\} \le \alpha$.

Assuming (3.19) holds, there exists an $\hat{\alpha} < 1$ such that

(3.20) $E_{\pi,\gamma}\{\hat{\alpha}[P(z(\ell))]\} \le \hat{\alpha}$.

Using the recursion (3.5) and the contraction (3.14), we obtain

(3.21) $\lim_{k \to \infty} E_{\pi,\gamma}\{|\Pi^{\pi}(k) - \Pi^{\pi'}(k)|\} = 0 \quad \forall \pi, \pi' \in \Pi,\ \gamma \in \Gamma$.

This is analogous to convergence of the conditional state distribution (and not simply the conditional mean) to an initial-value-independent trajectory in the Kalman filter.

* With respect to conventional metrics on $\Pi$.
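The loss of memory expressed by (3.21) can be observed by running the recursion (3.5) from two different initial state probabilities on the same I/O string. The sketch below is illustrative only: the function names are hypothetical, and the 2-state, single-input FPS is an arbitrary example built from a transition matrix `A` and an observation matrix `B` (both assumptions), with $P_{ij}(y|u) = A_{ij} B_{jy}$.

```python
def make_example_fps():
    """Arbitrary 2-state FPS with one input 'u' and outputs 0 and 1."""
    A = [[0.9, 0.1], [0.2, 0.8]]      # state transition probabilities
    B = [[0.8, 0.2], [0.3, 0.7]]      # Pr{y | next state}
    return {(y, "u"): [[A[i][j] * B[j][y] for j in range(2)] for i in range(2)]
            for y in (0, 1)}


def filter_step(pi, P_yu):
    """T(pi, (u, y)) of (3.3): pi P(y|u), normalised to sum to one."""
    n = len(pi)
    unnorm = [sum(pi[i] * P_yu[i][j] for i in range(n)) for j in range(n)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("the I/O pair has probability zero under pi")
    return [x / total for x in unnorm]


def run_filter(pi0, P, z):
    """Recursion (3.5) along the string z of (u, y) pairs."""
    pi = list(pi0)
    for u, y in z:
        pi = filter_step(pi, P[(y, u)])
    return pi


def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))
```

In this example every entry of every $P(y|u)$ is positive, so each step is a strict contraction in the sense of (3.14), and the two beliefs merge regardless of the particular output sequence.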

An FPS may be trivially modified so that Condition (3.19) is satisfied. For $0 < \varepsilon \ll 1$, multiply each $P(y|u)$ by $1 - \varepsilon$ and then add $\varepsilon / (\#S \cdot \#Y)$ to each entry of each $P(y|u)$. The quantity $\varepsilon$ may be interpreted as the probability of model failure, as discussed in Section I.
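The modification just described is a one-line transformation of the matrices $P(y|u)$. In the sketch below the dictionary layout `P[(y, u)]` and the function names are assumptions made for illustration; the arithmetic is the one stated above.

```python
def perturb_fps(P, n_states, n_outputs, eps):
    """Scale every P(y|u) by (1 - eps) and add eps / (#S * #Y) to each entry."""
    bump = eps / (n_states * n_outputs)
    return {key: [[(1.0 - eps) * entry + bump for entry in row] for row in matrix]
            for key, matrix in P.items()}


def underlying_matrix(P, u, outputs):
    """P(u) of (2.2): the sum over outputs y of P(y|u)."""
    n = len(P[(outputs[0], u)])
    return [[sum(P[(y, u)][i][j] for y in outputs) for j in range(n)]
            for i in range(n)]
```

Each row of the perturbed $P(u)$ still sums to one, since the scaling removes mass $\varepsilon$ from the row and the $\#S \cdot \#Y$ added terms restore exactly $\varepsilon$. Every entry of every perturbed $P(y|u)$ is strictly positive, so each matrix is subrectangular.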


IV. DYNAMIC PROGRAMMING FORMULATION


Define:

(4.1) $e^i$ is the "unit vector" in $\Pi$ whose $i$-th entry equals unity.

(4.2) $V$ is the vector space of real-valued bounded continuous functions on $\Pi$.

(4.3) $\bar{V} = \{v \in V : v(e^N) = 0\} \subset V$.

(4.4) $\theta \in V$ is the "zero function" $\theta(\pi) = 0$, $\forall \pi \in \Pi$.

(4.5) $q(u)$ is the expected incremental reward vector, a vertical N-vector with entries

$q_i(u) = \sum_{j \in S,\ y \in Y} P_{ij}(y|u)\, R(i, u, y, j)$.

(4.6) $f_\beta : V \to V$ is the discounted dynamic programming operator

$[f_\beta v](\pi) = \max_{u \in U} \{\pi q(u) + \beta \sum_{y \in Y} (\pi P(y|u) \mathbf{1})\, v(T(\pi, (u, y)))\}$.

(4.7) $f : V \to V$ is the undiscounted dynamic programming operator, given by $f = f_1$.

(4.8) $\bar{f} : V \to \bar{V}$ is the normalized (undiscounted) dynamic programming operator given by

$[\bar{f} v](\pi) = [f v](\pi) - [f v](e^N)$.

Remark: This operator corresponds to a value-iteration algorithm of D. J. White [19].

(4.9) $\bar{f}_\lambda : \bar{V} \to \bar{V}$ is the damped normalized (undiscounted) dynamic programming operator given by

$\bar{f}_\lambda v = \lambda \bar{f} v + (1 - \lambda) v$.

Following Astrom (1966),

(4.10) $G_K(\pi) = \max_{\gamma \in \Gamma} E_{\pi,\gamma}\{g(K)\} = K^{-1} [f^K \theta](\pi)$.

Similarly, using the contraction property of discounted dynamic programming operators, we see that $f_\beta$ has a unique fixed point $v_\beta$, satisfying

(4.11) $v_\beta = \lim_{K \to \infty} (f_\beta)^K v \quad \forall v \in V$

and

(4.12) $G_\beta(\pi) = \max_{\gamma \in \Gamma} E_{\pi,\gamma}\{g(\beta)\} = (1 - \beta)\, v_\beta(\pi)$.

This last equation is justified as outlined in Chapter 6 of [4].

Both $G_K$ and $G_\beta$ are known to be convex and continuous on $\Pi$.


V. THE FIXED POINT THEOREM


We now require a second condition:

(5.1) Condition (reachability). There is a $\rho < 1$ and an integer $\ell$ such that, for every $\pi \in \Pi$, $j \in S$, a sequence of inputs $u_1, \ldots, u_\ell$ exists, satisfying

$1 - \sum_{i \in S} \pi_i [P(u_1) \cdot \ldots \cdot P(u_\ell)]_{ij} \le \rho$.

Also define

$R_{\max} = \max_{i \in S} \max_{u \in U} q_i(u)$

$R_{\min} = \min_{i \in S} \min_{u \in U} q_i(u)$

$C = \ell\, (R_{\max} - R_{\min}) / ((1 - \rho)(1 - \alpha))$

The following theorem is the main result of this research.
(5.2) Theorem. Assume Conditions (3.19) and (5.1). Now, for any $0 < \lambda < 1$, the sequence $\bar{f}_\lambda^k \theta$, $k = 1, 2, \ldots$, converges uniformly to a function $v^*$ in $\bar{V}$ having the following properties:

(i) $\bar{f} v^* = v^*$

(i') (equivalent to (i)) There is a constant $g$, called the gain or optimal performance, such that $[f v^* - v^*](\pi) = g$, $\forall \pi \in \Pi$

(ii) $v^*$ is convex

(iii) $\|v^*\| \le C$

(iv) $v^*(\pi) + K g - \max_{\pi' \in \Pi}\{v^*(\pi')\} \le [f^K \theta](\pi) \le v^*(\pi) + K g - \min_{\pi' \in \Pi}\{v^*(\pi')\}$

(v) $v^*(\pi) + g/(1-\beta) - \max_{\pi' \in \Pi}\{v^*(\pi')\} \le v_\beta(\pi) \le v^*(\pi) + g/(1-\beta) - \min_{\pi' \in \Pi}\{v^*(\pi')\}$.

Remark: The damped operator $\bar{f}_\lambda$ of (4.9) corresponds to a value-iteration algorithm of P. J. Schweitzer [14].

Now (2.8), (2.9), (2.10) are immediate consequences of (4.10), (4.12), and (5.2).
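The iteration $\bar{f}_\lambda^k \theta$ of Theorem (5.2) can be imitated on a computer for a 2-state FPS by representing functions on $\Pi$ on a grid. The sketch below is an approximation under stated assumptions, not the method of the source: a belief is parametrised by $p = \Pr\{\text{state } 1\}$, so that $e^N$ corresponds to $p = 0$; functions are stored at equally spaced grid points and evaluated by linear interpolation; all names and the dictionary layouts `P[(y, u)]` and `q[u]` are hypothetical.

```python
def belief_update(p, P_yu):
    """Return (Pr{y | p, u}, posterior p) for the 2x2 matrix P(y|u)."""
    unnorm0 = p * P_yu[0][0] + (1.0 - p) * P_yu[1][0]
    unnorm1 = p * P_yu[0][1] + (1.0 - p) * P_yu[1][1]
    prob_y = unnorm0 + unnorm1
    posterior = unnorm0 / prob_y if prob_y > 0.0 else 0.0
    return prob_y, posterior


def interpolate(v, p):
    """Piecewise-linear evaluation of the grid function v at p in [0, 1]."""
    cells = len(v) - 1
    x = min(max(p, 0.0), 1.0) * cells
    k = min(int(x), cells - 1)
    t = x - k
    return (1.0 - t) * v[k] + t * v[k + 1]


def backup(v, p, P, q, inputs, outputs):
    """[f v](p): the undiscounted operator (4.7) at one belief."""
    best = float("-inf")
    for u in inputs:
        total = p * q[u][0] + (1.0 - p) * q[u][1]      # pi q(u)
        for y in outputs:
            prob_y, p_next = belief_update(p, P[(y, u)])
            if prob_y > 0.0:
                total += prob_y * interpolate(v, p_next)
        best = max(best, total)
    return best


def damped_value_iteration(P, q, inputs, outputs, lam=0.5, cells=50, iters=300):
    """Iterate v <- lam * (f v - [f v](e^N)) + (1 - lam) * v from v = 0."""
    v = [0.0] * (cells + 1)
    gain = 0.0
    for _ in range(iters):
        fv = [backup(v, k / cells, P, q, inputs, outputs) for k in range(cells + 1)]
        gain = fv[0]                  # [f v](e^N); v(e^N) is always zero
        v = [lam * (fv[k] - gain) + (1.0 - lam) * v[k] for k in range(cells + 1)]
    return gain, v


def build_fps(A, B):
    """P_ij(y|u) = A[u][i][j] * B[j][y] for a 2-state, 2-output example."""
    return {(y, u): [[A[u][i][j] * B[j][y] for j in range(2)] for i in range(2)]
            for u in A for y in range(2)}
```

The returned `gain` plays the role of $g$ in property (i') once the iterates have settled. The grid size, damping factor and iteration count are arbitrary choices, and the interpolation error is not analysed here.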


VI. COMPUTATION OF AN $\varepsilon$-OPTIMAL CONTROLLER


Condition (3.19) implies that the state estimator can be arbitrarily closely approximated, for $M$ sufficiently large, by a finite-state automaton that retains only the most recent $M$ input-output pairs. (Compare this with a similar property of the stable Kalman filter.)
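A finite-memory estimator of this kind can be sketched by restarting the recursion (3.5) from a fixed prior and applying it only to the last $M$ I/O pairs. The choice of a uniform restart prior, the function names, and the example FPS (built from assumed matrices `A` and `B` with $P_{ij}(y|u) = A_{ij} B_{jy}$) are all assumptions made for illustration.

```python
def make_example_fps():
    """Arbitrary 2-state FPS with one input 'u' and outputs 0 and 1."""
    A = [[0.9, 0.1], [0.2, 0.8]]
    B = [[0.8, 0.2], [0.3, 0.7]]
    return {(y, "u"): [[A[i][j] * B[j][y] for j in range(2)] for i in range(2)]
            for y in (0, 1)}


def filter_step(pi, P_yu):
    """One step of the state estimator: pi P(y|u), normalised."""
    n = len(pi)
    unnorm = [sum(pi[i] * P_yu[i][j] for i in range(n)) for j in range(n)]
    total = sum(unnorm)
    return [x / total for x in unnorm]


def run_filter(pi0, P, z):
    """Exact estimator: recursion over the whole I/O string z."""
    pi = list(pi0)
    for u, y in z:
        pi = filter_step(pi, P[(y, u)])
    return pi


def finite_memory_estimate(P, z, memory, n_states=2):
    """Estimator that sees only the most recent `memory` I/O pairs."""
    window = z[max(0, len(z) - memory):]
    uniform = [1.0 / n_states] * n_states
    return run_filter(uniform, P, window)


def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))
```

Because the window contents take finitely many values, the windowed estimator is a finite-state automaton. In this example each step contracts strictly, so the gap between the exact and windowed estimates shrinks geometrically as `memory` grows.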

$\varepsilon$-optimal finite-memory strategies may be computed by a method, called perceptive dynamic programming, based on this property of the state estimator. Suppose that the controller is able, at time $k$, to exactly measure the internal state at time $k - M(k)$, and suppose moreover that the process $M(k)$ is such that $\tilde{z}(k) = [s(k - M(k));\ (u(k - M(k)), y(k + 1 - M(k))), \ldots, (u(k-1), y(k))]$ is a sufficient statistic*. Then the problem can be expressed as an MDP having state process $\tilde{z}(k)$; this is a simple generalization of [5]. Of course the resulting policy depends on information that is not available in practice, and so it cannot be considered a solution to the original problem. But the performance obtained is clearly an upper bound on feasible performance, since it assumes the availability of more information. Now the delayed state can be guessed and substituted into this policy. The resulting controller is feasible (when the guess of $s(k - M(k))$ is a function of the I/O pairs in $\tilde{z}(k)$) and the closed loop system is now a Markov chain whose performance is readily evaluated. This performance is a lower bound on optimal feasible performance. It can be shown [11] that the difference between these bounds approaches zero as a lower bound on $M(k)$ is increased. This algorithm will be discussed in detail in a later publication.

* For example, $M(k)$ might be a constant. More generally, it suffices that $M(k+1)$ be expressed as a (deterministic) function of $\tilde{z}(k)$, $u(k)$ and $y(k+1)$ alone, and that $M(k+1) \le M(k) + 1$.


VII. CONCLUSIONS


A finite-element plant model has been considered and controllability/observability-like conditions have been shown to imply well-posedness of the problem in the infinite-horizon case. A key concept in obtaining these results was a metric with respect to which the state estimator is a contraction. The author is currently interested in generalizing this metric to distributions on infinite state sets such as Euclidean space or the unit sphere. In the case of a Kalman filter, the contraction, in order to be analogous with what is presented here, must account not only for convergence of the conditional mean, but for convergence of the entire distribution to a normal distribution with appropriate covariance as well.